Learning Bias and Phonological-Rule Induction

Authors

  • Daniel Gildea
  • Daniel Jurafsky
Abstract

A fundamental debate in the machine learning of language has been the role of prior knowledge in the learning process. Purely nativist approaches, such as the Principles and Parameters model, build parameterized linguistic generalizations directly into the learning system. Purely empirical approaches use a general, domain-independent learning rule (Error Back-Propagation, Instance-Based Generalization, Minimum Description Length) to learn linguistic generalizations directly from the data. In this paper we suggest that an alternative to the purely nativist or purely empiricist learning paradigms is to represent the prior knowledge of language as a set of abstract learning biases, which guide an empirical inductive learning algorithm. We test our idea by examining the machine learning of simple Sound Pattern of English (SPE)-style phonological rules. We represent phonological rules as finite state transducers which accept underlying forms as input and generate surface forms as output. We show that OSTIA, a general-purpose transducer induction algorithm, was incapable of learning simple phonological rules like flapping. We then augmented OSTIA with three kinds of learning biases which are specific to natural language phonology, and are assumed explicitly or implicitly by every theory of phonology: Faithfulness (underlying segments tend to be realized similarly on the surface), Community (Similar segments behave similarly), and Context (Phonological rules need access to variables in their context). These biases are so fundamental to generative phonology that they are left implicit in many theories. But explicitly modifying the OSTIA algorithm with these biases allowed it to learn more compact, accurate, and general transducers, and our implementation successfully learns a number of rules from English and German.
Furthermore, we show that some of the remaining errors in our augmented model are due to implicit biases in the traditional SPE-style rewrite system which are not similarly represented in the transducer formalism, suggesting that while transducers may be formally equivalent to SPE-style rules, they may not have identical evaluation procedures. Our algorithm is not intended as a cognitive model of human learning; but it is intended to suggest the kind of biases which may be added to empiricist induction models to build a cognitively and computationally plausible learning model for phonological rules.
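
As an illustration of what it means to represent a phonological rule as a transducer from underlying to surface forms, the following is a minimal hand-written sketch, not the transducer learned or reported in the paper. It assumes a deliberately simplified flapping rule (t becomes a flap, written here as "D", between two vowels, with stress ignored), one character per segment, and a five-letter vowel set. The names `VOWELS`, `TRANSITIONS`, `FINAL_OUTPUT`, and `flap` are hypothetical.

```python
# Simplified flapping: t -> D / V _ V (stress ignored; an assumption for illustration).
VOWELS = set("aeiou")  # assumed vowel inventory, one character per segment


def seg_class(seg: str) -> str:
    """Map a segment to the input class used by the transition table."""
    if seg in VOWELS:
        return "V"
    if seg == "t":
        return "t"
    return "C"


# (state, input class) -> (output prefix, echo the input segment?, next state)
# State 0: start, or last segment was a consonant.
# State 1: last segment was a vowel.
# State 2: saw vowel + t; the output for that t is delayed until the next segment.
TRANSITIONS = {
    (0, "V"): ("", True, 1),
    (0, "t"): ("", True, 0),
    (0, "C"): ("", True, 0),
    (1, "V"): ("", True, 1),
    (1, "t"): ("", False, 2),   # hold the t: we do not yet know if a vowel follows
    (1, "C"): ("", True, 0),
    (2, "V"): ("D", True, 1),   # vowel follows: the held t surfaces as a flap
    (2, "t"): ("t", True, 0),   # no vowel follows: the held t surfaces unchanged
    (2, "C"): ("t", True, 0),
}

# Output emitted when the input ends in a given state (flushes a held t).
FINAL_OUTPUT = {0: "", 1: "", 2: "t"}


def flap(underlying: str) -> str:
    """Run the transducer over an underlying form and return the surface form."""
    state = 0
    surface = []
    for seg in underlying:
        prefix, echo, state = TRANSITIONS[(state, seg_class(seg))]
        surface.append(prefix + (seg if echo else ""))
    surface.append(FINAL_OUTPUT[state])
    return "".join(surface)


if __name__ == "__main__":
    for form in ["bata", "bat", "batta", "atata"]:
        print(form, "->", flap(form))
```

The delayed output in state 2 is what makes the machine subsequential: the surface symbol for t depends on the following segment, so the transducer waits one step before committing. An induction algorithm such as OSTIA has to discover this kind of structure from underlying/surface pairs alone, which is where the biases described in the abstract come in.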

Similar resources

Learning natural and unnatural phonological stress by 9- and 10-year-olds: A preliminary report

Research into adult learning of natural and unnatural pairs of artificial languages has demonstrated that it is easier to learn a phonological rule that is based on naturalness in language than a similar, but unnatural, version of the same rule (Wilson 2006, Carpenter 2010). Infants' learning of natural and unnatural phonology has produced mixed results (Gerken and Boltt, 2008; Seidl and Buckl...

Unsupervised Discovery of Phonological Categories through Supervised Learning of Morphological Rules

We describe a case study in the application of symbolic machine learning techniques for the discovery of linguistic rules and categories. A supervised rule induction algorithm is used to learn to predict the correct diminutive suffix given the phonological representation of Dutch nouns. The system produces rules which are comparable to rules proposed by linguists. Furthermore, in the process of...

Automatic Induction of Finite State Transducers for Simple Phonological Rules

This paper presents a method for learning phonological rules from sample pairs of underlying and surface forms, without negative evidence. The learned rules are represented as finite state transducers that accept underlying forms as input and generate surface forms as output. The algorithm for learning them is an extension of the OSTIA algorithm for learning general subsequential finite state t...

Learning with Globally Predictive Tests

We introduce a new bias for rule learning systems. The bias only allows a rule learner to create a rule that predicts class membership if each test of the rule in isolation is predictive of that class. Although the primary motivation for the bias is to improve the understandability of rules, we show that it also improves the accuracy of learned models on a number of problems. We also introduce ...

Word clustering effect on vocabulary learning of EFL learners: A case of semantic versus phonological clustering

The aim of this study is to determine the effect of word clustering method on vocabulary learning of Iranian EFL learners through a case of semantic versus phonological clustering. To this effect, 80 homogeneous students from four intermediate classes at an English institute in Torbat e Heydariyeh participated in this research. They were assigned to four groups according to semantic versus phon...

Journal:
  • Computational Linguistics

Volume 22

Publication date: 1996